System for recognition of hand-written characters

ABSTRACT

A document contains zones in which to write at least one character per zone, and directions as to the information that is required in the zones. The directions are printed in the zones such that the written characters constituting the required information overlap the printed directions.

TECHNICAL FIELD

The present invention relates to the recognition of handwritten characters, and in particular to a process and a system for recognizing handwritten characters on documents of the form type, to be filled in.

STATE OF THE ART

Modern computer techniques permit the automatic reading of handwritten documents under conditions in which the cost is very much lower than the cost of a manual operation.

However, automatic processing of handwritten documents is possible with high efficiency only if documents of the form type are used, which have been pre-printed with boxes that are to be filled in and then read.

Pre-printing requires the user, or writer, who fills in a document of the form type, to position the characters properly and to write one legible character in each box. In each case, the writer is asked to write a letter (generally a capital letter), a number or an X.

The written document is then read by an electro-optical detector which generally delivers a signal having two levels, a first level corresponding to the shade of the handwritten characters, and a second level corresponding to the color of the paper in the boxes. An image processing means then carries out recognition of the handwritten characters located in the boxes and associates with each box a binary series of data according to known techniques.
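By way of illustration only, the two-level signal described above can be sketched in Python as a simple thresholding of reflectance samples. The function name `binarize`, the reflectance scale from 0.0 (dark) to 1.0 (bright paper) and the threshold of 0.5 are assumptions chosen for this sketch; they are not taken from the known techniques referred to.

```python
def binarize(reflectances, threshold=0.5):
    """Map each reflectance sample to 1 (ink) or 0 (paper).

    Assumption: samples lie between 0.0 (dark) and 1.0 (bright paper).
    """
    return [1 if r < threshold else 0 for r in reflectances]


# One scan line across a box: paper, a dark pen stroke, paper again.
scan_line = [0.9, 0.85, 0.2, 0.1, 0.15, 0.9]
bits = binarize(scan_line)
```

In this sketch the three dark samples become 1 bits and the paper samples become 0 bits, which is the kind of binary series that later processing would work on.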

During processing, it is obviously necessary to separate or eliminate the characters previously printed on the document. Several processes have been used to carry out this elimination, in particular the use of non-actinic printing. This process has drawbacks, however, in particular the requirement of printing the material in at least two colors.

Another process not having these drawbacks is described in French patent application 95 10031. The process described in that application consists in using a form containing printed boxes in which characters are to be handwritten so as to be read by an optical recognition device. The boxes are defined by predetermined designs whose constituent elements are characterized by at least one parameter verifying a predetermined relationship. During reading of the document, the objects whose parameters verify the predetermined relationship can thus easily be eliminated as being constituent elements of the predetermined designs.

Although effective, this process nevertheless requires that the form indicate, facing or below the boxes to be filled in, the nature of the information to be written in these boxes, such as “NAME”, “GIVEN NAME”, “ADDRESS”, etc. This printing, which is necessary so that the writer will write the correct information in the appropriate boxes, has the principal drawback of occupying a substantial surface of the form and hence wasting paper.

SUMMARY OF THE INVENTION

This is why the principal object of the invention is to provide a system of handwritten character recognition using a form in which no surface outside the boxes is used to print the information relating to the boxes to be filled in.

The principal object of the invention is thus a process for handwritten character recognition in a document of the form type containing printed boxes as well as pre-printed characters within the boxes, in which boxes characters are to be handwritten according to the indications supplied by the pre-printed characters. The printed boxes and the pre-printed characters are constituted by predetermined elements characterized by at least one parameter verifying a predetermined relationship, whilst the predetermined relationship is not verified for the handwritten characters. The process comprises the steps of reading the document by successive elemental zones with the help of a character recognition device, determining that the parameter or parameters characterizing a read object in a set of elemental zones verify the predetermined relationship, and eliminating the objects for which the predetermined relationship is verified as being predetermined elements.

Another object of the invention is the provision of a document of the form type containing printed boxes in which to handwrite characters adapted to be read by an optical recognition device, the boxes containing pre-printed characters supplying indications to the user as to the handwritten characters to write therein. The printed boxes as well as the pre-printed characters are constituted of predetermined elements characterized by at least one parameter verifying a predetermined relationship, whilst the predetermined relationship is not verified for the handwritten characters, so that the objects whose parameter or parameters verify the predetermined relationship can easily be eliminated as being predetermined elements during reading of the document by an optical recognition device.

BRIEF DESCRIPTION OF THE FIGURES

The aims, objects and characteristics of the invention will become more apparent from a reading of the description which follows, given with reference to the accompanying drawings, in which:

FIG. 1 shows a portion of a form used in the prior art;

FIG. 2 shows a portion of a form using the principles of the invention; and

FIG. 3 shows schematically a device for character recognition used for reading the form according to the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

A conventional form is generally as shown in FIG. 1. The writer must fill in the boxes following the printed directions for each group of boxes. Thus, he must write his name in the boxes of the first line 10, generally in capital letters, then his given name in the boxes of the second line 12. In the third line 14, he must indicate his age by writing two digits. In certain lines, such as line 16 labeled “SEX”, he must place an X in one box, either that designated “M” (for male) or that designated “F” (for female).

As will be seen, it is necessary to provide on this form regions contiguous to the boxes (as in FIG. 1) or below the boxes, to indicate the information to be written into the boxes, such as the name, given name, age, sex, etc.

The invention avoids using up a surface of the form to print the directions mentioned above, by using first of all the process forming the object of patent application 95 10031. In other words, the boxes 10, 12, 14 or 16 are constituted of elements which are characterized by one or more parameters verifying a predetermined relationship, such as the size of the elements, their surface, their position, the spacing between two elements, or else parameters characteristic of the shape of the elements.

At the time of reading by the character recognition device, the processing carried out consists in eliminating the elements of the design of each box when parameters verifying said predetermined relationship are detected, and then recognizing the character located there.
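As a minimal sketch of elimination based on one such parameter, the Python example below takes the surface (area in pixels) of each object as the parameter and "area not exceeding a maximum" as the predetermined relationship. The function name `remove_small_objects`, the use of 4-connectivity and the value `max_area=2` are assumptions made for illustration; the source does not specify how objects are delimited or which threshold is used.

```python
def remove_small_objects(image, max_area):
    """Return a copy of a binary image in which every 4-connected object
    whose area is <= max_area has been erased (set to 0)."""
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Collect the pixels of one object with an iterative flood fill.
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # The object verifies the (assumed) relationship: eliminate it.
            if len(pixels) <= max_area:
                for y, x in pixels:
                    result[y][x] = 0
    return result


# Three isolated design dots and one larger handwritten stroke.
image = [
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
]
cleaned = remove_small_objects(image, max_area=2)
```

In this sketch the isolated dots are erased while the six-pixel stroke is kept. A design element that touches a handwritten stroke would merge with it and survive, so a practical implementation would need a more robust criterion.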

Building on this concept, the invention consists in printing the indications needed to help the writer directly in the boxes, and not facing or below the boxes as was the case previously, using the same principles as those described in patent application 95 10031. The indications, such as those shown in FIG. 2, occupy a portion of the rows of boxes in which the writer will write the required information. As is apparent from FIG. 2, the printed indications (e.g., “SURNAME”) overlap an imaginary horizontal axis that is positioned at half the height of the row of boxes. Thus, “NAME” is inscribed in the row of boxes 20 in which the surname is to be written, “FIRST NAME” is written in row 22, “ADDRESS” is written in row 24, “ZIP CODE” is written in row 26, “CITY” is written in row 28 and “AGE” is written in the two boxes 30 adapted to receive the written age of the writer. One of the two boxes 32 and 34 for the sex, comprising respectively an “M” (male) and an “F” (female), must be marked with an X.

The directions printed within the boxes are intended, like the boxes themselves, to be eliminated upon reading by the character recognition device.

In one embodiment, the boxes and/or the pre-printed characters are formed of elements characterized by one or several parameters verifying a predetermined relationship which is recognized at the time of reading the form.

So that the characters written by the writer will stand out in spite of the existence of pre-printed characters, it is desirable that the latter be of a shade sufficiently light that they are barely visible once the boxes are filled in by the writer, whilst being readable before filling in. The boxes of the form and the characters which are pre-printed in them could for example be printed in a light blue shade, and the writer could be asked to mark the form with a black pen.

It should be noted that an embodiment falling within the scope of the invention as set forth is one in which the pre-printed characters within the boxes and/or the boxes themselves are printed with a non-actinic ink and hence are ignored upon reading by the character recognition device.

A device for optical recognition used in the framework of the invention is shown in FIG. 3. The device has a light source 40 supplying a predetermined illumination to the document 42. Adjacent the light source 40 is disposed an electro-optical detector 44 which gathers the light reflected by the document as it moves in the direction of the arrow. The signals supplied by the detector 44 are then converted into digital signals by the analog-digital converter 46. The resulting digital signals are supplied to a processor (or microprocessor) 48 so as to eliminate the positioning designs, or to store them in a memory 50.

When the process of elimination is carried out by the electro-optical detector 44, the latter can act in an analog manner (which is to say in the manner of a photocopier). In this case, the designs used for pre-printing the form and the pre-printed characters within the boxes must be composed of elements having one dimension that is less than the resolution threshold of the detector. At the output of the detector, the elements of the design, not having been read by the detector because of its low resolution, will have disappeared, and the signals supplied will represent only the handwritten characters, whose width is generally considerably greater than the resolution threshold of the detector.
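The effect of a low-resolution detector can be approximated in software, purely for illustration, by letting each detector cell integrate several fine samples and report ink only when enough of the cell is covered. The function name `low_resolution_read`, the cell width of 4 samples and the coverage threshold of 0.5 are assumptions of this Python sketch; the actual detector performs this optically, and its behavior depends on how elements align with its cells.

```python
def low_resolution_read(ink, cell_width, threshold=0.5):
    """Simulate a detector whose cells each integrate `cell_width` fine samples.

    `ink` holds 1 where ink is present and 0 for bare paper, at fine pitch.
    A cell reports 1 only if at least `threshold` of its width is covered.
    """
    cells = []
    for start in range(0, len(ink), cell_width):
        window = ink[start:start + cell_width]
        coverage = sum(window) / len(window)
        cells.append(1 if coverage >= threshold else 0)
    return cells


# Two thin pre-printed lines (1 sample wide) and one pen stroke (4 samples wide).
fine_line = [0, 1, 0, 0,  0, 0, 1, 0,  1, 1, 1, 1,  0, 0, 0, 0]
detected = low_resolution_read(fine_line, cell_width=4)
```

Here the thin lines cover only a quarter of their cells and vanish, while the wider pen stroke fills its cell and is retained.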

The signals can then be converted into digital signals for subsequent processing and character recognition.

The elimination process can be done directly by the electronic detector of the character recognition device when the detector has a low resolution, or subsequent to the image detection, by means of digital processing, when the detector has a high resolution.

Thus, simple software can be used when the designs of the boxes and the pre-printed characters in the boxes are constituted of simple elements, for example thin vertical or oblique lines.

Subsequent processing, carried out by the processor 48 of the recognition device, then uses software in which the series of bits whose first and fourth bits are zero, in particular the sequence 0110 corresponding to the elements of the design, are replaced by sequences of 0 bits. The elements of the printed design and the pre-printed characters in the boxes will thus be eliminated. Conversely, the series comprising at least three consecutive 1 bits, of the type 01110 corresponding to the image of a stroke of a written character, will not be eliminated. Such processing permits the elimination of designs whose elements are characterized by their shape, with reference to a matrix as explained in patent application 95 10031.
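The bit-sequence rule described above can be sketched in Python as a sliding window of four bits applied to one row of the digitized image. This is only one possible reading of the rule: the function name `eliminate_design_bits`, the row-by-row application and the text helpers are assumptions of this sketch, and rows are assumed to begin and end with paper (0 bits).

```python
def eliminate_design_bits(bits):
    """Apply the 4-bit window rule to one row of bits.

    Wherever the first and fourth bits of a window are 0 (pattern 0xx0),
    the two enclosed bits are replaced by 0.
    """
    cleaned = list(bits)
    for i in range(len(bits) - 3):
        # Decisions are taken on the original row, not the partly cleaned one.
        if bits[i] == 0 and bits[i + 3] == 0:
            cleaned[i + 1] = 0
            cleaned[i + 2] = 0
    return cleaned


def from_text(text):
    """Hypothetical helper: turn a string such as "0110" into a list of ints."""
    return [int(ch) for ch in text]


def to_text(bits):
    """Hypothetical helper: turn a list of ints back into a string."""
    return "".join(str(b) for b in bits)


design = to_text(eliminate_design_bits(from_text("0110")))
stroke = to_text(eliminate_design_bits(from_text("01110")))
mixed = to_text(eliminate_design_bits(from_text("0011001110010")))
```

In this sketch the thin design element 0110 is cleared, the three-bit stroke in 01110 is left untouched, and in the mixed row only the stroke survives.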

It should be noted that, although in the above example the reference matrix does not take account of more than one element, a reference matrix taking account of two or several elements constituting the designs and the pre-printed characters can be used without departing from the scope of the invention.

What is claimed is:
 1. A process for the recognition of written characters inscribed in zones of a document, each zone having a zone height, the process comprising the steps of: reading, on the document, inscribed characters written in at least one of said zones and superposed on printed directions that overlap a horizontal axis positioned at half the zone height; detecting at least one elimination parameter characterizing the printed directions; and eliminating the printed directions based on said at least one elimination parameter.
 2. The process of claim 1, in which said elimination parameter comprises a predetermined shape, wherein the step of detecting comprises comparing the light reflected by an assembly of elemental zones of the document with a reference matrix whose elements correspond respectively to elemental zones of the printed directions, so as to eliminate, during the eliminating step, the objects entirely contained in the reference matrix, and save only the other objects as being portions of written characters.
 3. The process of claim 1, wherein the step of reading comprises using non-actinic light.
 4. The process of claim 1, wherein the step of reading inscribed characters comprises reading directions constituted by elements having at least one parameter verifying a predetermined relationship whereas said predetermined relationship is not verified for written characters.
 5. A process for printing a document comprising the steps of: providing zones in which to write at least one character per zone; and printing, in at least one of said zones, at least one pre-inscribed character supplying printed directions as to which characters to write in said zones, wherein the printed directions overlap a horizontal axis positioned at half a height of said zones so that at least one written character to be inscribed in said zones overlaps said pre-inscribed character, wherein each written character is constituted by elements having at least one elimination parameter permitting elimination of the pre-inscribed character during reading of the document by an optical recognition device.
 6. The process of claim 5, wherein the printing step comprises using non-actinic ink.
 7. The process of claim 5, wherein the printing step comprises printing directions having half the height of the zones which contain them.
 8. The process of claim 5, wherein the printing step comprises printing directions constituted by elements having at least one parameter verifying a predetermined relationship whereas said predetermined relationship is not verified for written characters.